THE PROBLEM

To determine the fastest and most valid method for examining
the speech intelligibility achieved by an individual or a
communications system.

FINDINGS

One method recently suggested for sentence intelligibility
testing is to ask the listener not to write down the words he hears,
but simply to check the ordinal number of the sentence on a printed
list of ten sentences. This is fast and easy to score, but a direct
comparison between this method and one in which the subject
actually writes down the words heard reveals that there is no
correlation between the two tasks. Thus the task of sentence
identification cannot be used where a valid measure of
intelligibility is desired.

APPLICATION 

For communications engineers involved in determining a figure
of merit for a system, and for oto-audiologists seeking to
determine the speech intelligibility of which a particular person
is capable.
ABSTRACT


This study quantified the differential effects of white noise
masking on the intelligibility of sentences vs. the identification
of a sentence from a closed set. Twenty so-called "synthetic
sentences," constructed by Speaks and Jerger for intelligibility
testing but containing very little meaning, were mixed with white
noise and presented to 64 Navy enlisted men, who were asked, in
one case, to write down as many of the words as they could
understand in each sentence, and in the second case, simply to
select the correct sentence from a printed list of 10 sentences.
These tasks were performed with the speech material always at
60 dB SPL, but with the white noise raised in ten steps from 60 to
94 dB. The white noise tended to obscure the low-intensity,
high-frequency consonant discriminations generally assumed to be
necessary for intelligibility, but at all comparable signal/noise
ratios it left relatively untouched the perception of pitch contour
and other prosodic parameters characterized by high intensity and
low frequency, which made sentence identification possible even in
the absence of any intelligibility whatever. A zero correlation
between the two tasks revealed that sentence identification from a
closed set is a task unrelated to an understanding of the words in
a sentence.




DIFFERENCES BETWEEN WORD INTELLIGIBILITY AND SENTENCE
IDENTIFICATION RESPONSES TO "SYNTHETIC" SENTENCES

INTRODUCTION 


Speech intelligibility has traditionally been assessed in the
clinic with a wide variety of speech materials, such as
monosyllabic words, nonsense syllables, and rhyming consonants.
The assumption is often made that intelligibility of these
materials is dependent upon frequency (spectral) discriminations.
Some investigators suggest that this assumption may be too
restricted and that temporal characteristics or patterns may also
underlie the intelligibility of conversational speech. Sentence
tests have been used to provide the time duration necessary for
utilizing the temporal characteristics of a speech message.

Limitations of sentence tests for intelligibility measurement
have been noted. These limitations consist primarily of the use of
an open message set; the unreliability of the response, which is
influenced by, among other things, the subject's attitude toward
the test situation and his willingness to guess; and the difficulty
in scoring responses. Speaks and Jerger suggested a method of
using sentences which reduces these limitations, consisting of
having the subject simply identify the sentence from a closed set
of "synthetic sentences." The synthetic sentences were constructed
from random or conditional word probabilities, so that relative
informational content and sentence length could be controlled.


Jerger suggested that the "filtering characteristic" of the
synthetic sentence identification response is fundamentally
different from that of the word intelligibility response.
Intelligibility was said to require discrimination among the
relatively high-frequency consonant sounds, whereas in the
identification response the low-frequency vowel information can be
used to produce a correct identification.

The present investigation attempts to quantify the effects of,
and study the basis for, differential "filtering characteristics."


METHOD 

1.  Preparation  of  Test  Material. 

Two different synthetic sentence lists incorporating a
third-order conditional probability (Appendix A), each consisting
of ten sentences, were used in this study. Both were recorded on a
Wollensak Model 1500 tape recorder, with RMS peaks at VU = 0.
There was a 10-sec pause on the tape between sentences. The
recorded sentences were played through a clinical audiometer at a
constant intensity of 60 dB sound pressure level (SPL), and for
sentences 1-10 of each list, white noise at 60, 70, 80, 82, 84,
86, 88, 90, 92, and 94 dB, respectively, was electronically mixed.
The sentences and noise were re-recorded together on a single
channel of a second Wollensak recorder.
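
The noise schedule just described can be tabulated with a short sketch. It uses only the levels stated in the text (speech fixed at 60 dB SPL, ten noise levels from 60 to 94 dB); the names SPEECH_DB, NOISE_DB, and snr_by_sentence are illustrative and are not part of the original procedure.

```python
# Levels as stated in the text, in dB SPL.
SPEECH_DB = 60
NOISE_DB = [60, 70, 80, 82, 84, 86, 88, 90, 92, 94]


def snr_by_sentence(speech_db=SPEECH_DB, noise_db=NOISE_DB):
    """Return {sentence number: S/N ratio in dB} for sentences 1-10."""
    return {i: speech_db - n for i, n in enumerate(noise_db, start=1)}


if __name__ == "__main__":
    for sentence, snr in snr_by_sentence().items():
        print(f"sentence {sentence:2d}: S/N = {snr:+d} dB")
```

Under these levels the S/N ratio runs from 0 dB for the first sentence to -34 dB for the tenth, which matches the ratios listed in Table I.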

2.  Response  Sheets. 

In the first test, the intelligibility response required that
the subject write down, on a sheet of paper containing 10 blank
lines, as many words of a sentence as he could understand from the
presentation of the tape recording. In the second test, the
identification response required that the subject simply check off
the number of a particular sentence out of the list of ten
synthetic sentences printed on the response sheet.

3. Subjects.

Subjects were screened for normal hearing (re: Amer. Nat'l.
Standards Inst., 1969) prior to testing; 64 Navy enlisted men
between the ages of 17 and 24 years were used.

4.  Test  Presentation. 

The subjects were tested in groups with TDH-39 earphones in
MX-41 cushions on the right ear. The left ear was covered with a
dummy earphone. All tapes were presented by a Wollensak Model
1500 tape recorder in a sound-absorbent room with low ambient
noise.

Prior to each identification response run, each subject was
made acquainted with the sentences he would later identify. The
ten sentences he would hear in the identification test were
randomly scrambled and presented twice in succession at an
intensity level of 70 dB SPL, with no noise competition. The
sentence list was then presented at the steadily more adverse
signal/noise (S/N) ratios.

The order of response was counterbalanced so that half of the
subjects took the intelligibility test first and half took the
identification test first. The lists were also counterbalanced so
that each was used for each response an equal number of times.
There was no special time lapse between the two tests.
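
One way to realize such a counterbalancing scheme is sketched below. The text states only that test order and list assignment were each balanced; fully crossing the two factors (four cells of 16 men each) is an assumption made here for illustration, as are the names ORDERS, LIST_FOR_INTELLIGIBILITY, and assign_conditions.

```python
from itertools import cycle, product

ORDERS = ("intelligibility first", "identification first")
# The list paired with the intelligibility response; the other list is
# then used for the identification response.
LIST_FOR_INTELLIGIBILITY = ("list 1", "list 2")


def assign_conditions(n_subjects=64):
    """Cycle subjects through the four order-by-list combinations.

    Assumption: the two factors are fully crossed. The source does not
    say how order and list assignment were combined.
    """
    cells = cycle(product(ORDERS, LIST_FOR_INTELLIGIBILITY))
    return [next(cells) for _ in range(n_subjects)]


if __name__ == "__main__":
    for subject, cell in enumerate(assign_conditions(), start=1):
        print(subject, cell)
```

With 64 subjects this gives 32 men per test order and 32 per list assignment, which satisfies both balance conditions stated in the text.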

5.  Scoring. 

a) Intelligibility Test: A subject's score for any sentence
was the number of correct words. Mean score for each sentence was
computed and expressed as a percentage of the total possible
correct words at each S/N ratio.

b) Identification Response: A subject's score was either
correct or incorrect for any sentence. Per cent correct score for
each sentence was the percentage of all the men who had correctly
identified that sentence.
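
The two scoring rules can be sketched as follows. The report does not say how a written word was judged correct, so the matching rule used here (case-insensitive, independent of word position, each target word credited at most once) is an assumption, and all function names are illustrative.

```python
from collections import Counter


def words_correct(target, response):
    """Count response words that match target words.

    Assumed rule: case-insensitive, position-independent, and each
    target word can be credited only once.
    """
    target_counts = Counter(target.upper().split())
    response_counts = Counter(response.upper().split())
    return sum((target_counts & response_counts).values())


def percent_words_correct(target, responses):
    """Per cent of the total possible correct words over all subjects."""
    possible = len(target.split()) * len(responses)
    earned = sum(words_correct(target, r) for r in responses)
    return 100.0 * earned / possible


def percent_identified(correct_number, choices):
    """Per cent of subjects who checked the correct sentence number."""
    hits = sum(1 for c in choices if c == correct_number)
    return 100.0 * hits / len(choices)
```

For example, percent_identified(3, [3, 3, 1, 3]) gives 75.0, since three of the four hypothetical subjects checked sentence 3.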

RESULTS 

There was a mean correct word intelligibility score of 18.1
words (S.D. = 7.07) for all 10 sentences, and a mean correct
sentence identification of 7.5 sentences (S.D. = 1.61).

The Pearson product-moment correlation between the responses,
utilizing all S/N ratios, was .04.
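
For reference, the product-moment correlation can be computed as in the sketch below. It assumes that each subject contributes one pair of scores (total words correct, total sentences identified); the report does not state exactly which pairs were correlated, and the per-subject data are not given, so no values from the study are reproduced here. The name pearson_r is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / sqrt(sxx * syy)
```

A value near zero, such as the .04 reported, means that a subject's word score gives essentially no linear prediction of his identification score.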

Table I gives the mean per cent scores for each S/N; across all
S/N's, the mean per cent correct word intelligibility score was
29.2, and the mean per cent sentence identification score was 73.1
(t test significant at the .01 level).

Table I. The mean correct word intelligibility and sentence
identification responses at the signal-to-noise ratios
investigated, expressed as a per cent of the total possible
correct responses, respectively.

NOTE: Speech always at 60 dB SPL.

Speech-to-Noise      Per Cent Correct Word   Per Cent Sentence
Ratio in Decibels    Intelligibility         Identification Correct

       0                   90.5                    93.5
     -10                   68.5                    93.5
     -20                   49.0                    91.0
     -22                   33.0                    91.0
     -24                   23.0                    75.0
     -26                   12.0                    87.0
     -28                   12.5                    78.0
     -30                    2.0                    47.5
     -32                    1.0                    45.0
     -34                    1.0                    30.0

    MEAN                   29.2                    73.1

N: 64

Fig. 1 shows that the S/N at the 50% correct point was about
-18 dB for word intelligibility and about -32 dB for sentence
identification.
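
As a rough cross-check, the Table I means can be recomputed and a 50% point estimated directly from the tabled values. Linear interpolation between adjacent tabled points is an assumption of this sketch; the 50% points quoted in the text are read from Fig. 1, so the two need not agree closely. The names SNR_DB, WORD_PCT, IDENT_PCT, and snr_at_percent are illustrative.

```python
# Values transcribed from Table I.
SNR_DB = [0, -10, -20, -22, -24, -26, -28, -30, -32, -34]
WORD_PCT = [90.5, 68.5, 49.0, 33.0, 23.0, 12.0, 12.5, 2.0, 1.0, 1.0]
IDENT_PCT = [93.5, 93.5, 91.0, 91.0, 75.0, 87.0, 78.0, 47.5, 45.0, 30.0]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def snr_at_percent(snr, pct, criterion=50.0):
    """Interpolate the S/N ratio at the first downward crossing of criterion.

    Assumption: straight-line interpolation between adjacent table rows.
    """
    points = list(zip(snr, pct))
    for (s0, p0), (s1, p1) in zip(points, points[1:]):
        if p0 >= criterion > p1:
            return s0 + (criterion - p0) * (s1 - s0) / (p1 - p0)
    raise ValueError("criterion is never crossed")


if __name__ == "__main__":
    print("mean word intelligibility:", mean(WORD_PCT))
    print("mean identification:", mean(IDENT_PCT))
    print("50% point, words:", snr_at_percent(SNR_DB, WORD_PCT))
    print("50% point, identification:", snr_at_percent(SNR_DB, IDENT_PCT))
```

The recomputed means (29.25 and 73.15) agree with the tabled 29.2 and 73.1. The interpolated 50% points come out near -19.5 dB and -29.8 dB, somewhat different from the values read from Fig. 1, although the ordering of the two responses is the same.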


DISCUSSION 

The zero correlation between the two responses to the same
sentences implies that the two response measures are different
procedures and are not necessarily sensitive to the same
parameters of speech perception. The significant difference
between the percentage mean correct scores for the two responses
further substantiates the fundamental difference suggested by the
zero correlation. Furthermore, the difference of about 14 dB
between the two tests at the 50% correct performance is in general
maintained at all levels of performance, as is shown by the
continued separation of the curves in Fig. 1.

Fig. 1. Difference between the two tests at the 50% correct
performance mark. (Ordinate: Per Cent Correct Response; abscissa:
Signal/Noise Ratio in dB.)

The basis for the differential sensitivity of the two responses
to the important variables of speech perception might best be
pursued along psycholinguistic lines. The separation of the curves
in Fig. 1 indicates that the identification response is quite
generally more resistant to white noise interference. This greater
resistance may reside in the possibility that correct
identification of an entire sentence, within such a closed message
set, is possible with correct perception of as little as one word,
or even part of a word, unique to that sentence. Further, the
overall intonation pattern of each sentence is unique in a closed
set. Since the subject has, then, at most, only ten intonation
patterns to discriminate, it is possible, and, from the data in
Fig. 1, even probable, that sentences at some S/N ratios can be
identified correctly without even one word being correctly
perceived. On the other hand, word intelligibility certainly
requires discrimination of the encoded phonemic elements.

A short excursion into some proposed parameters underlying the
speech code may shed light here. It has been demonstrated that
encoded information about underlying consonantal phonemic
structure is carried, in part, by the less intense, higher
frequency second formant and its concomitant transitions. The
first formant is thought to carry the prosodic features of an
utterance. Because of the sharply asymmetrical upward spread of
masking, high levels of white noise (with equal intensity at all
frequencies) might be expected to degrade the low-intensity,
high-frequency second formant transitions more severely than the
more intense and lower frequency first formant. The end effect of
this differential masking may have shown up in the present
results. Correct prosodic feature perception, then, may be all
that is needed to identify a sentence correctly in a closed
message set.

It might be added that if, indeed, perception of the first
formant is important to identification of a sentence in a closed
set, then distorting the temporal cues afforded by the first
formant should reduce the identification score; perhaps then other
parameters of the speech code would take on added importance.

If we assume that consonants are important carriers of
intelligibility, then the temporal parameters of the second
formant transitions should be studied further. The synthetic
sentences are heavily loaded with vowel information and do not,
therefore, depend strongly upon the perception of the second
formant transitions; i.e., a person may shift almost completely to
a vowel detection strategy. Since consonant detection strategies
seem to be the more important in designing a measure of
communicative handicap, research in and development of tests of
perception strategies seem warranted.
APPENDIX A


Third-Order  "Synthetic"  Sentence  Message  Sets  (From  Jerger,  1970) 


 1. SMALL BOAT WITH A PICTURE HAS BECOME          1. OWN LOT IN YOUR AMERICAN FOOD HAS
 2. BUILT THE GOVERNMENT WITH THE FORCE ALMOST    2. IS MEETING IN THE COLLEGE OF THE
 3. GO CHANGE YOUR CAR COLOR IS RED               3. US WITHOUT FIRST THOUGHT WAS VERY NICE
 4. FORWARD MARCH SAID THE BOY HAD A              4. AGREE WITH US IT ONLY SHALL COOK
 5. MARCH AROUND WITHOUT A CARE IN YOUR           5. WITH HUMAN NATURE CAN BE MET AT
 6. THAT NEIGHBOR WHO SAID BUSINESS IS BETTER     6. IN OUR DIFFERENT NAME BECAUSE HE CAN'T
 7. BATTLE CRY AND BE BETTER THAN EVER            7. MARCH AROUND THE TIME WAS EVERYTHING THAT
 8. DOWN BY THE TIME IS REAL ENOUGH               8. MORE OF IT IS CERTAIN TO BE
 9. AGREE WITH HIM ONLY TO FIND OUT               9. WE'LL GO TO SCHOOL TODAY WAS THE
10. WOMEN VIEW MEN WITH GREEN PAPER SHOULD       10. SO ALLOW US TO THE SOUND AND